Shared memory multiprocessors

Author

  • Leonid Ryzhyk
Abstract

Hardware evolution has reached the point where it is extremely difficult to further improve the performance of superscalar processors, either by exploiting more instruction-level parallelism (ILP) or by using new semiconductor technologies. Efforts to increase processor performance by exploiting ILP follow the law of diminishing returns: new, more complex optimisations cost more in silicon and design effort while providing ever smaller performance gains. In addition, the aggressive use of speculative caching and execution techniques in modern superscalars leads to poor energy efficiency, an important concern both in embedded systems with limited battery capacity and in server systems, where heat dissipation is a problem of growing importance. The natural solution is to rely on thread-level parallelism (TLP) rather than ILP to further increase the computational power of computer systems. The following forms of TLP are currently in use: explicit multithreading, chip-level multiprocessing (CMP), symmetric multiprocessing (SMP), asymmetric multiprocessing (ASMP), non-uniform memory access multiprocessing (NUMA), and clustered multiprocessing. With the exception of clustered multiprocessors, all of these architectures give every core in the system access to a shared physical address space.
The shared-memory organisation has three major advantages over the simpler private-memory organisation. First, because communication in shared-memory systems does not have to interfere with computation, and because access to shared memory can be streamlined using hardware caching, shared memory provides an extremely efficient low-latency, high-bandwidth communication mechanism. Second, shared memory provides a natural communication abstraction that is well understood by most developers. Third, the shared-memory organisation allows multithreaded or multiprocess applications developed for uniprocessors to run on shared-memory multiprocessors with minimal or no modifications. The goal of this report is to give an overview of the issues and tradeoffs involved in memory hierarchy design for shared-memory multiprocessors.

Similar resources

Modeling and Performance Evaluation of Multi-Processors Organization with Shared Memories

This paper is primarily concerned with the theoretical evaluation of the performance of multiprocessor systems. A Markovian waiting-line model has been developed for various multiprocessor configurations with shared memory. The system is analysed at the request level rather than the job level.

Experiences with Data Distribution on NUMA Shared Memory Multiprocessors

The choice of a good data distribution scheme is critical to the performance of data-parallel applications on both distributed memory multiprocessors and NUMA shared memory multiprocessors. The high cost of interprocessor communication in distributed memory multiprocessors makes the minimization of communications the predominant issue in selecting data distribution schemes. However, on NUMA multipro...

Computation and Data Partitioning on Scalable Shared Memory Multiprocessors

In this paper we identify the factors that affect the derivation of computation and data partitions on scalable shared memory multiprocessors (SSMMs). We show that these factors necessitate an SSMM-conscious approach. In addition to remote memory access, which is the sole factor on distributed memory multiprocessors, cache affinity, memory contention and false sharing are important factors that...

Techniques for Module-Level Speculative Parallelization on Shared-Memory Multiprocessors Research Proposal

Multiprocessors have hit the mainstream and cover the whole spectrum of computational needs from small-scale symmetric multiprocessors to scalable distributed shared-memory systems with a few hundred processors. This has made it possible to boost the performance of a number of important applications from the numeric and database domain. Extending the scope of applications that can take advantag...

Automatic Localization for Distributed-Memory Multiprocessors Using a Shared-Memory Compilation Framework

In this paper, we outline an approach for compiling for distributed-memory multiprocessors that is inherited from compiler technologies for shared-memory multiprocessors. We believe that this approach to compiling for distributed-memory machines is promising because it is a logical extension of the shared-memory parallel programming model, a model that is easier for programmers to work with, an...

Publication date: 2006